SYNTHETIC HISTORY FOR ADAPTIVE DATA COMPRESSION

FIELD OF THE INVENTION 
The present invention relates to data compression and, more
specifically, to compressing and decompressing data using pre-defined
history-related information.

BACKGROUND OF THE INVENTION 

Data compression algorithms convert data defined in a given format to
another format so that the resulting format contains fewer data bits
(i.e., the ones and zeros that define digital data) than the original
format. Hence, the data is compressed into a smaller representation.
When the original data is needed, the compressed data is decompressed
using an algorithm that is complementary to the compression algorithm.

Data compression techniques are used in a variety of data processing
and data networking applications. Personal computer operating systems
use data compression techniques to reduce the size of data files stored
in the hard disk drives of the computer. This enables the operating
system to store more files on a given disk drive. Data networking
equipment uses data compression techniques to reduce the amount of data
sent over a data network. For example, when a web browser retrieves a
file from a web server, the file may be sent over the Internet in a
compressed format. This reduces the transmission time for sending the
file and reduces the usage of the network, thereby reducing the cost of
transmission.

Many compression schemes use translation data dictionaries that
contain a series of mappings between the original data and the
compressed representations of the actual data. For example, the letter
"A" may be represented by the binary string "010." To ensure that data
is decompressed accurately, the compressor and decompressor use
identical dictionaries. In this case, the dictionaries may be supplied
to each component (compressor or decompressor) or dynamically created by
each component using known algorithms.

A dictionary typically is derived from the data according to a selected
scheme relating to various statistical information gathered therefrom,
such as the frequencies of certain patterns in the data. For example,
the length of the bit representation in the encoding table for each of
the encoded data patterns may be selected so that it is inversely
related to the frequency of occurrence of the corresponding patterns.
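The inverse relation between pattern frequency and code length can be
illustrated with a short sketch. The following Python example is only an
illustration under stated assumptions: it uses Huffman coding (one
well-known way of obtaining such lengths), treats single characters as
the "patterns", and the function name huffman_code_lengths is
hypothetical rather than anything defined in this document.

```python
import heapq
from collections import Counter


def huffman_code_lengths(text):
    """Return {symbol: code length in bits} for a Huffman code built from text."""
    freq = Counter(text)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the unique
    # tie_breaker keeps Python from ever comparing the dictionaries.
    heap = [(weight, i, {sym: 0}) for i, (sym, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in d1.items()}
        merged.update({sym: depth + 1 for sym, depth in d2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# "a" occurs 8 times, "b" 4 times, "c" twice and "d" once.
lengths = huffman_code_lengths("aaaaaaaabbbbccd")
```

In this sample the most frequent symbol receives the shortest code and
the two rarest symbols receive the longest ones.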

Hereinafter, the term "text" refers to a stream of data bits which is
provided as a unit to the compression algorithm. Text is not limited to
word data from a document; it may include image data and other types of
data. As noted above, the text may have features or characteristics such
as internal patterns of data.

There are several well-known data compression methods which may be
classified according to how they generate and use dictionaries. Static
compression algorithms use static dictionaries; they do not affect,
update or otherwise change the dictionary for a given unit of text.
Dynamic compression algorithms, on the other hand, constantly update or
change the dictionary according to features or characteristics of the
text based on a selected scheme. In semi-static compression algorithms,
the dictionary is occasionally updated or changed according to a
selected scheme. Hereinafter, the term "adaptive compression algorithm"
refers to a dynamic or semi-static algorithm in which the history is
either constantly or occasionally updated or changed according to data
pattern variations encountered in the text.

In adaptive compression algorithm schemes, the dictionary is commonly
referred to as a history. At any given moment in time during the
compression process, the history is a representation of some or all of
the data that has been processed by the compression algorithm. An
example using the text "when and if where is to be determined" is
illustrative. When compressing data, the compression algorithm checks
each string of data to determine whether that string already appears in
the history. Thus, when the compression algorithm reaches the "whe" at
the start of "where", the string "when" is already in the history. As a
result, the compression algorithm can reference (and, consequently,
compress) the first three letters of "where". From the above, it may be
observed that the compression ratio for the text depends on the number
of matches that are found in the history and on the length of the
matched terms (given that a long term may be represented by a
relatively short representation).
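The history lookup in the example above can be sketched in a few lines.
This is a simplified illustration, assuming the history is kept as
ordinary text and searched by brute force; the function name
longest_match and the min_len parameter are hypothetical, and practical
implementations typically use hash tables or similar structures instead.

```python
def longest_match(history, lookahead, min_len=2):
    """Find the longest prefix of lookahead that already occurs in history.

    Returns (position_in_history, length), or None when no prefix of at
    least min_len characters is present.
    """
    for length in range(len(lookahead), min_len - 1, -1):
        position = history.find(lookahead[:length])
        if position != -1:
            return (position, length)
    return None


text = "when and if where is to be determined"
split = text.index("where")
history, remaining = text[:split], text[split:]

# The compressor has processed "when and if " and now looks at "where ...".
match = longest_match(history, remaining)
```

Here the match covers the three letters "whe", which the compressor can
replace with a reference to the earlier "when".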

The data in the history may consist of ordinary text (as in the example
above). Alternatively, the history may consist of more sophisticated
representations such as hash data or a linked list.

Adaptive algorithms have a number of advantages. For example, these
algorithms permit the history to be adjusted to better match data
patterns in the text. Thus, adaptive algorithms have, in essence, a
"learning" capability. Furthermore, the history need not necessarily be
transmitted along with the encoded data, but rather can be fully rebuilt
at the receiving end from the encoded data during decompression. Thus,
this class of techniques is particularly well suited for data
compression in a communication system.

Examples of adaptive data compression techniques are the well-known
Lempel-Ziv algorithms known, respectively, as LZ77 and LZ78, for
constructing the encoding table (Ziv J., Lempel A.: A universal
algorithm for sequential data compression, IEEE Transactions on
Information Theory, Vol IT-23, (1977) pp. 337-343; Ziv J., Lempel A.:
Compression of individual sequences via variable rate coding, IEEE
Transactions on Information Theory, Vol IT-24, (1978) pp. 530-536).

Waterworth (Waterworth J.R.: Data compression system, US Patent No
4,701,745, October 20, 1987) and Whiting (Whiting D.L., George G.A.,
Ivey G.E.: Data compression apparatus and method, US Patent No
5,016,009, May 14, 1991; Whiting D.L., George G.A., Ivey G.E.: Data
compression apparatus and method, US Patent No 5,126,739, June 30,
1992) provide efficient implementations of the Lempel-Ziv LZ77 technique
for identifying data patterns in the text. A similar fast implementation
is given by Williams (Williams R.N.: An extremely fast Ziv-Lempel data
compression algorithm, Proceedings Data Compression Conference DCC '91,
Snowbird, Utah, April 8-11, 1991, IEEE Computer Society Press, Los
Alamitos, CA). In addition, Brent (Brent R.P.: A linear algorithm for
data compression, The Australian Computer Journal, Vol 19, (1987) pp.
64-68) describes a technique that takes advantage of both LZ77 and the
Huffman encoding (Huffman D.A.: A method for the construction of minimum
redundancy codes, Proceedings IRE, Vol 40, (1952) pp. 1098-1101).

Although these techniques have been successfully employed in many
applications, there remains a need for improved compression techniques,
particularly in data networking applications.

The adaptive compression technique of the invention uses preliminary
history data that is available when compressing a given text. This
preliminary history data is derived from history-related information
referred to herein as synthetic history data. By providing preliminary
history data, the method increases the probability that matches will be
found between the text and the history file data during the compression
process. In particular, more matches may occur for the portion of the
data in the text that is processed during the early stages of the
compression process.

This advantage of the invention may be better understood by comparison
with conventional adaptive compression techniques that commence
compressing a given text with an empty history file. In the conventional
case, there is never a match for the very first element of data in the
text that is processed by the compression algorithm because there is no
history data. Moreover, there typically will be few matches of the text
and the history data until a relatively large amount of data has been
added to the history file. In contrast, the present invention provides
more efficient compression because the text will be compressed to a
higher degree due to the increased number of matches between the text
and the history file data.
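The difference between starting with an empty history and starting with
a pre-loaded one can be observed with a small experiment. The sketch
below is not the method of the invention itself; it assumes Python's
standard zlib module, whose preset dictionary (the zdict argument) plays
a role comparable to a pre-loaded history. The sample strings and the
function name compressed_size are hypothetical.

```python
import zlib

# Hypothetical "typical" HTML used to pre-load the history.
TYPICAL_HTML = (
    b'<html><head><title></title></head><body>'
    b'<a href="http://"><img src=""></a></body></html>'
)


def compressed_size(data, history=None):
    """Return the compressed size of data, optionally with a pre-loaded history."""
    if history is None:
        compressor = zlib.compressobj(level=9)
    else:
        # zdict seeds the compressor's window before any input is seen.
        compressor = zlib.compressobj(level=9, zdict=history)
    return len(compressor.compress(data) + compressor.flush())


page = b'<html><head><title>Hi</title></head><body><img src="a.png"></body></html>'

cold = compressed_size(page)                 # empty history
warm = compressed_size(page, TYPICAL_HTML)   # history pre-loaded
```

For a short text such as this page, most of the gain comes from matches
found at the very start of the input, which an empty history cannot
provide.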
In one embodiment of the invention, the data for the synthetic history
files is defined based on an expected correlation between a given text
to be compressed and a predefined body of text that is related in some
manner to the text to be compressed. Given two texts of a similar type
(e.g., articles written in French), it is expected that there will be a
certain correlation between words appearing in two or more of these
texts. It may be understood, then, that a wide variety of category types
may be defined, for example, books that deal with environmental
engineering or HTML documents.

In one embodiment, the synthetic history information is derived from a
"typical text." The "typical text" is defined as a text of a particular
type that contains strings which are most likely to appear in a document
of a certain type. For example, an HTML (hyper-text mark-up language)
document will likely contain such strings as "http://", "<HTML>" and
"IMG SRC=".

The compression algorithm processes the "typical text" to create
preliminary history data. When a target text of a given type is to be
compressed, the associated pre-defined "typical text" is compressed
before compressing the target text. This creates a preliminary history
that is used to compress the target text. In this way, when the target
text is compressed, there may be a high probability that many of the
strings in the target text are already present in the history and,
therefore, may be referenced.

In one embodiment, the compressed data that results from the "typical
text" is discarded. This eliminates the need, for example, to transmit
this data to a remote equipment where the compressed target text is to
be decompressed.

When the compressed target text is to be decompressed, the
decompression algorithm performs steps complementary to those described
above. Specifically, the decompression algorithm first decompresses a
compressed "typical text," thereby creating a preliminary history. Next,
the decompression algorithm decompresses the compressed target text
using the preliminary history. The decompressed text that results from
the compressed "typical text" is then discarded. Thus, the decompression
stage produces an accurate replica of the original target text.
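The compress-then-discard sequence described above can be sketched with
a streaming compressor. The example below is an illustration under
stated assumptions, not a definitive implementation: it uses raw DEFLATE
streams from Python's zlib module, relies on Z_SYNC_FLUSH to emit all
pending output while keeping the history window intact, and uses
hypothetical function names and sample data.

```python
import zlib


def precompress_typical(typical):
    """Produce the compressed "typical text" that the decompressor keeps on hand."""
    c = zlib.compressobj(wbits=-15)  # raw DEFLATE, no header or checksum
    return c.compress(typical) + c.flush(zlib.Z_SYNC_FLUSH)


def compress_target(typical, target):
    """Compress target after priming the history with the typical text."""
    c = zlib.compressobj(wbits=-15)
    # Compress the typical text first; its output is discarded, but the
    # compressor's internal history now contains the typical text.
    c.compress(typical)
    c.flush(zlib.Z_SYNC_FLUSH)
    return c.compress(target) + c.flush(zlib.Z_SYNC_FLUSH)


def decompress_target(typical_compressed, payload):
    """Rebuild the history from the compressed typical text, then decode payload."""
    d = zlib.decompressobj(wbits=-15)
    d.decompress(typical_compressed)  # decompressed typical text is discarded
    return d.decompress(payload)


TYPICAL = (
    b"From: \r\nTo: \r\nSubject: \r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n"
)
message = (
    b"From: a@example.com\r\nTo: b@example.com\r\nSubject: hello\r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n\r\nhi"
)

payload = compress_target(TYPICAL, message)
restored = decompress_target(precompress_typical(TYPICAL), payload)
```

Only the payload needs to travel between the two sides; each side
derives the same preliminary history from its own copy of the typical
text.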

In one embodiment, the invention is implemented in a pair of devices
installed in a data network such as the Internet. For example, the
devices may be installed between a pair of routers that define an IP
hop. The device on the sending end of the hop intercepts each packet
that is sent over the WAN and determines whether that packet contains a
type of text that may be compressed using synthetic history. If so, the
packet is compressed using synthetic history.

The device on the other end of the hop also intercepts each packet and
determines whether that packet was compressed using synthetic history.
If so, the packet is decompressed using synthetic history.

BRIEF DESCRIPTION OF THE DRAWINGS

These and other features of the invention will become apparent from
the following description and claims, when taken in conjunction with the
accompanying drawings, wherein similar reference characters refer to
similar elements throughout and in which:

FIGURE 1 is a block diagram of one embodiment of a data compression and
decompression system in accordance with the invention;

FIGURE 2 is a flowchart of operations that may be performed by a
compression system implemented according to the invention;

FIGURE 3 is a flowchart of operations that may be performed by a
decompression system implemented according to the invention;

FIGURE 4 is a block diagram of one embodiment of a computer configured
to perform compression and/or decompression methods according to the
invention;

FIGURE 5 is a block diagram of one embodiment of a data network system
incorporating compression and decompression in accordance with the
invention; and

FIGURE 6 is a block diagram of another embodiment of a data network
system incorporating compression and decompression in accordance with
the invention.
DESCRIPTION OF EXEMPLARY EMBODIMENTS

FIGURE 1 is a block diagram of a data compression system C (top half
of FIGURE 1) and a data decompression system D (bottom half of FIGURE 1)
in accordance with one embodiment of the invention. Here, a first
processor 20 executes a data compression program 22 that uses synthetic
history to compress input data. As represented by the dashed line 24,
the compressed data is sent to a second processor 26. The second
processor 26 executes a data decompression program 28 that uses
synthetic history to decompress the compressed data.

The operation of the components of FIGURE 1 may be better understood
by reference to FIGURES 2 and 3. FIGURE 2 is a flowchart of one
embodiment of operations that may be performed by the data compression
stage C. FIGURE 3 is a flowchart of one embodiment of operations that
may be performed by the data decompression stage D.

The method of FIGURE 2 commences at block 100. At some point in time
prior to the beginning of the compression process, synthetic history
data is created for each data type that is to be compressed using
synthetic history compression (block 102). The synthetic history data
for the data types is stored in one or more synthetic history data files
30 in a data memory 32.

As discussed above, the data to be compressed (e.g., the "text"
referred to in the Background section) may represent various media
including, for example, conventional character text, image data, video
data or audio data. In addition, much of this data may be classified as
a particular data type; for example, language type (French, English,
etc.) or document type (electronic mail, JAVA™, HTML documents,
executable code, etc.).

In one embodiment of the invention, the synthetic history file 30 for a
given data type contains a collection of characters and strings that are
likely to appear in data of that type. The synthetic history data (e.g.,
the "typical data") for a given data type may be chosen using
statistical methods. For example, character or string probabilities for
a given type may be generated by analyzing a large number of files of
that type.
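One simple statistical approach is to count how many sample files of a
type contain each fixed-length string and to keep the most widespread
ones. The sketch below is only one possible way of doing this; the
function name build_typical_data, the n-gram length and the sample files
are assumptions made for illustration.

```python
from collections import Counter


def build_typical_data(samples, ngram=6, top=3):
    """Concatenate the n-grams that occur in the largest number of samples."""
    doc_freq = Counter()
    for sample in samples:
        # A set counts each n-gram at most once per sample file.
        grams = {sample[i:i + ngram] for i in range(len(sample) - ngram + 1)}
        doc_freq.update(grams)
    # Sort by descending document frequency; break ties by byte order so
    # the result does not depend on set iteration order.
    ranked = sorted(doc_freq.items(), key=lambda item: (-item[1], item[0]))
    return b"".join(gram for gram, _count in ranked[:top])


samples = [
    b"<html><body>one</body></html>",
    b"<html><body>two</body></html>",
    b"<html><p>x</p></html>",
]
typical = build_typical_data(samples, ngram=6, top=3)
```

A larger corpus and variable-length strings would give a more useful
result; the fixed n-gram length simply keeps the example short.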

At the beginning of the compression process for a given set of data,
the processor 20 (FIGURE 1) receives input data from an input data
source 34 (block 104). Two types of input data are typical: static file
data or streaming data. In the first case (described in more detail
below in conjunction with FIGURE 4), the input data source 34 may be a
data memory such as a disk drive or random access memory (RAM). In the
case of streaming data (described in more detail below in conjunction
with FIGURES 5 and 6), the input data source 34 may be a data interface
device that receives, for example, streaming packet data from a data
network.

At block 106, a data type identifier routine 36 analyzes either the
input data or information associated with the input data to ascertain
the data type of the input data. For example, in the first case the data
type identifier 36 may perform a relatively fast analysis of the input
data. This may involve searching for strings that commonly appear at the
beginning of particular documents. For example, an e-mail file may
contain headers such as "From" and "To". A Microsoft® Word® document may
contain a common signature such as "WordDocument". Alternatively, the
identification process may involve searching for words that are very
common in one language but not in other languages.

One example of the second case relates to compression of streaming
packet data in an Internet environment. Here, the data type identifier
36 ascertains the TCP port with which the incoming packet data is
associated by analyzing the header information in the packet. TCP ports
are defined by the TCP/IP protocol that is used by many applications to
route data over the Internet. In the instant case, the TCP port number
may be used to identify a data type when it is known that a specific
type of file typically originates from a particular TCP source port. As
an example, the compression device may be configured to compress data
originating from a particular server. That is, all traffic from the
server may be routed through the compression device. In this case, it
may be possible to predict the types of data that typically are sent by
the server on a particular TCP port. From the above, it should be
understood that many other methods of data type identification may be
used in practicing the invention.
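Both identification approaches described above (leading signatures for
file data, TCP source port for packet data) can be sketched briefly. The
signature table, the port-to-type table and all function names below are
hypothetical; the only protocol fact relied upon is that the source port
occupies the first two bytes of a TCP header in network byte order.

```python
import struct

# Hypothetical tables; a real deployment would define its own.
SIGNATURES = [
    (b"from:", "email"),
    (b"<html", "html"),
    (b"<!doctype html", "html"),
]
PORT_TYPES = {25: "email", 80: "html"}


def identify_by_content(data, probe=64):
    """Guess the data type from strings near the start of the data."""
    head = data[:probe].lstrip().lower()
    for signature, data_type in SIGNATURES:
        if head.startswith(signature):
            return data_type
    return None


def source_port(tcp_segment):
    """Read the 16-bit source port from the start of a TCP header."""
    (port,) = struct.unpack("!H", tcp_segment[:2])
    return port


def identify_by_port(tcp_segment):
    """Guess the data type from the TCP source port, if it is a known one."""
    return PORT_TYPES.get(source_port(tcp_segment))


# A minimal 20-byte TCP header with source port 80 and destination 51000.
segment = struct.pack("!HH", 80, 51000) + bytes(16)
```

Returning None for unknown data leaves the caller free to fall back to
ordinary compression without a synthetic history.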

After the data type has been identified, a data selector function 38
retrieves the associated "typical data" (i.e., the synthetic history
data) from the data memory 32 (block 108). Next, the data compressor 22
compresses the "typical data" (block 110) and discards the resulting
compressed data (block 112). In conjunction with these steps, the
resulting history data (preliminary history data) is stored in a history
file 40 that will be used during the compression of the input data.
Thus, at this stage of the compression process, the history file 40 has
been pre-loaded with history data representative of the "typical data"
for the input data type.

The input data may now be compressed using the preliminary history
(block 114). In one embodiment, the data compressor 22 uses a Lempel-Ziv
compression method. It should be understood, however, that other methods
of data compression such as adaptive Huffman coding may be used in
practicing the invention.

At block 116, a reference is associated with the compressed data. This
reference is used by the decompression stage D to determine which
synthetic history to use when decompressing the compressed data. For
example, the reference may identify the data type. In the Internet
example discussed above, the reference may be inserted into the header
of a packet within which the compressed data is transmitted.

At block 118, the processor 20 sends the compressed data to an output
data destination 42 and the process ends at block 120. As above, the
destination 42 for the output data may be, for example, a data file or a
data stream. Thus, the output data destination 42 may comprise, for
example, a data memory or a data interface device as described below in
conjunction with FIGURES 4, 5 and 6.

In the embodiment described above, the output data includes only the
compressed input data and, if applicable, the reference. The output data
need not include the compressed "typical data" or the history data.

Referring to FIGURE 3, a decompression method is described beginning
at block 150. At block 152, synthetic history data is created for each
data type that is to be decompressed using a synthetic history. This
step is performed in a similar manner as described above in conjunction
with block 102, except that the "typical data" is pre-compressed using a
compression algorithm that is compatible with the algorithm used by the
data compressor 22 described above. Thus, in this embodiment, the
decompressor's synthetic history data files 44 contain "typical
compressed data."

At the beginning of the decompression process for a given set of data,
the processor 26 (FIGURE 1) receives compressed input data from an input
data source 46 (block 154). As stated above in conjunction with block
104, two types of input data are typical: static file data or streaming
data. Thus, the input data source 46 may comprise components similar to
those discussed above.

At block 156, a data type identifier routine 48 analyzes the compressed
input data or information associated with the input data to determine
the data type of the input data. For example, the data type identifier
48 may determine the data type by reading a reference that was
associated with the data as discussed above in conjunction with block
116.
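The role of the reference at blocks 116 and 156 can be sketched as a
small tag that travels with the compressed data. In the example below
the reference is a single leading byte, which is an assumption made for
brevity (the description above mentions a packet header as one
possibility). The table of synthetic histories, the type identifiers and
the function names are hypothetical, and zlib's preset dictionary stands
in for the pre-loaded history.

```python
import zlib

NO_HISTORY = 0  # reference value meaning "compressed without synthetic history"

# Hypothetical synthetic histories keyed by a data type identifier.
SYNTHETIC_HISTORIES = {
    1: b'<html><head><title></title></head><body><img src=""></body></html>',
    2: b"From: \r\nTo: \r\nSubject: \r\nContent-Type: text/plain\r\n",
}


def compress_tagged(data, type_id):
    """Compress data and prepend a one-byte reference to the history used."""
    history = SYNTHETIC_HISTORIES.get(type_id)
    if history is None:
        type_id = NO_HISTORY
        compressor = zlib.compressobj()
    else:
        compressor = zlib.compressobj(zdict=history)
    return bytes([type_id]) + compressor.compress(data) + compressor.flush()


def decompress_tagged(blob):
    """Read the reference, select the matching history, and decompress."""
    type_id, payload = blob[0], blob[1:]
    if type_id == NO_HISTORY:
        decompressor = zlib.decompressobj()
    else:
        decompressor = zlib.decompressobj(zdict=SYNTHETIC_HISTORIES[type_id])
    return decompressor.decompress(payload) + decompressor.flush()


page = b"<html><head><title>News</title></head><body>hello</body></html>"
blob = compress_tagged(page, 1)
```

Because the reference names the history rather than carrying it, the
output stays small while the decompressor can still select the matching
history on its own.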

After the data type has been identified, a data selector function 50
retrieves the associated "typical compressed data" (i.e., the synthetic
history data) from the data memory 52 (block 158). Next, the data
decompressor 28 decompresses the "typical compressed data" (block 160)
and discards the resulting decompressed data (block 162). In conjunction
with these steps, the resulting history data (preliminary history data)
is stored in a history file 54 that will be used during the
decompression of the input data. Thus, at this stage of the
decompression process, the history file 54 has been pre-loaded with
history data representative of the "typical compressed data" for the
input data type.

At block 164, the data decompressor 28 uses the preliminary history to
decompress the compressed input data. To this end, the data decompressor
28 uses a decompression algorithm that is complementary to the
compression algorithm used by the data compressor 22.

At block 166, the processor 26 sends the decompressed data (which is a
replica of the original data) to an output data destination 56 and the
process ends at block 168. As above, the destination for the output data
may be, for example, a data file or a data stream. Thus, the output data
destination 56 may comprise components similar to those discussed above.

In another embodiment of the invention, the synthetic history data used
by the data compressor 22 may comprise actual history data. For example,
in this embodiment, the "typical data" is not stored in the synthetic
history data file 30 as discussed above in conjunction with block 102 in
FIGURE 2. Instead, the history data that would result from the
compression of the "typical data" is stored in the synthetic history
data file 30. This history data may be generated, for example, by
explicitly defining the history data or by actually compressing the
"typical data". In the latter case, the history data is saved and the
compressed "typical data" discarded.

When input data is to be compressed in this embodiment, the step of
selecting the typical data (block 108) may simply involve copying the
history data from the synthetic history data file (e.g., file 30) to the
history file that is used for compressing the input data (e.g., file
40). In addition, the real-time steps of compressing the "typical data"
(block 110) and discarding the compressed "typical data" (block 112) are
omitted.

The decompression process may be modified in a similar manner. The
synthetic history data used by the data decompressor 28 may comprise
actual history data. For example, rather than storing the "typical
compressed data" in the synthetic history data file 44 as discussed
above in conjunction with block 152 in FIGURE 3, the history data that
would result from decompression of the "typical compressed data" is
stored in the synthetic history data file 44. This history data may be
generated, for example, by explicitly defining the history data or by
actually decompressing the "typical compressed data". In the latter
case, the history data is saved and the decompressed "typical compressed
data" discarded.

When input data is to be decompressed in this embodiment, the step of
selecting the typical data (block 158) may simply involve copying the
history data from the synthetic history file (e.g., file 44) to the
history file that is used to decompress the input data (e.g., file 54).
In addition, the real-time steps of decompressing the "typical
compressed data" (block 160) and discarding the decompressed "typical
compressed data" (block 162) are omitted.
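The pre-built history embodiments above replace real-time priming with a
copy of history that was prepared earlier. A minimal sketch follows,
assuming Python's zlib module, where an in-memory compressor or
decompressor object holds the history and its copy() method stands in
for copying a history file. The function names and sample data are
hypothetical.

```python
import zlib


def build_compress_history(typical):
    """Prime a compressor once; the compressed typical data is discarded."""
    c = zlib.compressobj(wbits=-15)
    c.compress(typical)
    c.flush(zlib.Z_SYNC_FLUSH)
    return c


def build_decompress_history(typical):
    """Prime a decompressor once by decoding the compressed typical data."""
    c = zlib.compressobj(wbits=-15)
    typical_compressed = c.compress(typical) + c.flush(zlib.Z_SYNC_FLUSH)
    d = zlib.decompressobj(wbits=-15)
    d.decompress(typical_compressed)  # decompressed typical data is discarded
    return d


def compress_from_history(prebuilt, data):
    """Copy the pre-built history, then compress data with the copy."""
    c = prebuilt.copy()
    return c.compress(data) + c.flush(zlib.Z_SYNC_FLUSH)


def decompress_from_history(prebuilt, payload):
    """Copy the pre-built history, then decompress payload with the copy."""
    d = prebuilt.copy()
    return d.decompress(payload)


TYPICAL = b"GET / HTTP/1.0\r\nHost: \r\nUser-Agent: \r\nAccept: */*\r\n\r\n"
compress_history = build_compress_history(TYPICAL)
decompress_history = build_decompress_history(TYPICAL)

first = b"GET /a HTTP/1.0\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
second = b"GET /b HTTP/1.0\r\nHost: example.org\r\nAccept: */*\r\n\r\n"
payloads = [compress_from_history(compress_history, m) for m in (first, second)]
restored = [decompress_from_history(decompress_history, p) for p in payloads]
```

Because each input works on its own copy, the pre-built history is left
unchanged and can be reused for every subsequent input of that type.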

From the above, it should be understood that synthetic history data
may include, for example, data that may be used to generate history
data, actual history data that is the product of a compression or
decompression operation, or predefined history data.

In another embodiment of the invention, synthetic history data (such as
the compressed "typical text") is sent from the data compressor stage C
to the data decompressor stage D. In this example, the steps described
in blocks 112, 116, 152, 156 and 158 may be omitted because the data
decompressor 28 may simply use the compressed "typical text" (generated
at block 110) at block 160. In practice, this technique may be less
efficient than the previously described techniques. Nevertheless, it
should be understood from the above that many adaptations involving the
use of synthetic histories are possible in practicing the invention.

FIGURE 4 illustrates some of the components that may be incorporated
into a device 200 that performs data compression and/or data
decompression in accordance with the invention. A processor 202 executes
program code (not shown) stored in a program memory 204 to perform, for
example, the methods described herein in conjunction with FIGURES 1-3
and 5-6. Typically, the program memory 204 comprises a read only memory
(ROM) device or a semi-permanent data memory such as a flash memory. The
computer 200 also includes at least one storage memory 206 for storing
dynamic data. Typically, the storage memory 206 comprises a random
access memory (RAM) device or a disk drive.

The program code may be pre-loaded into the program memory 204, for
example, at the factory. Alternatively, in embodiments that are
connected to a data network such as the Internet, the program may be
downloaded from a server via the data network.

In another embodiment, the program code may be stored on a removable
media 208 such as a CD-ROM or a floppy disk. In this case, the computer
200 would include a removable media drive 210 such as a CD-ROM drive or
a floppy disk drive. The program code may then be downloaded into the
program memory 204 or, in some cases, accessed directly by the processor
202 from the removable media 208.

One or more data interfaces 212 may enable the computer 200 to send or
receive data to or from external devices (not shown). This data may
include the program data, the original data, the compressed data or the
decompressed data. Examples of data interfaces 212 include parallel
ports, bus interfaces, or data network interfaces. The last example is
discussed in more detail below.

The teachings of the invention may be used for file compression
schemes that attempt to use disk drive space more efficiently by storing
files in a compressed format on the system disk. Such a scheme may be
used, for example, by a computer operating system. Referring to the
embodiment of FIGURE 4, in the following example the computer 200
includes an operating system installed in the program memory 204 and
executed by the processor 202. The operating system incorporates the
synthetic history compression and decompression methods described
herein. Thus, the operating system may compress files before they are
stored on the system hard disk drive (e.g., storage memory 206).
Similarly, the operating system may decompress files after they are
retrieved from the disk drive.

FIGURE 5 illustrates one embodiment of the invention that compresses
streaming data on-the-fly. Specifically, the invention is implemented in
equipment installed in a path in a data network. This embodiment
incorporates features and functional elements similar to those of the
embodiments described above to accomplish data compression and
decompression using synthetic histories.

Packet-based data networks (such as the Internet) transfer information
between computers and other equipment using a data transmission format
known as packetized data. The stream of data from a data source (e.g., a
host computer) is divided into variable-length units of data (i.e.,
packets). Switches (e.g., routers) in the network route the packets from
the source to the appropriate data destination. In many cases, the
packets may be relayed through several routers before they reach their
destination. Once the packets reach their destination, they are
reassembled to regenerate the stream of data.

Conventional packet-based networks use a variety of protocols to
control data transfer throughout a network. For example, the Internet
Protocol ("IP") defines procedures for routing data through a network.
To this end, IP specifies that the data is organized into frames, each
of which includes an IP header and the associated data. The routers in
the network use the information in the IP header to forward the packet
through the network. In the IP vernacular, each router-to-router (or
switch-to-router, etc.) link is referred to as a hop.

In FIGURE 5, a router 220 sends packets to another router 222 at the
other end of the hop. Some of the packets sent over the hop may be
associated with data types that can be compressed using pre-defined
synthetic histories (not shown). In accordance with the invention, a
compressor 224 compresses the data in these packets using the synthetic
histories. At the other end of the path, a decompressor 226 decompresses
the data in the compressed packets using pre-defined synthetic
histories.

In practice, the link between the two ends of the path may be a
permanent or temporary link. The link may be used to transfer unmodified
layer 3 protocol packets. Layer 3 is a network layer protocol and
encompasses, for example, the Internet Protocol ("IP") and those that
conform to the OSI ("Open System Interconnection") reference model.

The compressor 224 processes an inbound stream of packets from the
router 220. One or more network interfaces 228 in the compressor
terminate the packet protocols and provide the packet data to a
processor 230. When the devices (i.e., the compressor 224 and the
decompressor 226) are installed between the routers 220 and 222 as
illustrated in FIGURE 5, the network interface 228 connects to a wide
area network ("WAN") as described above. In some embodiments, the
compressor 224 may be installed farther up the link (i.e., before the
router 220). In this case, the network interface 228 may connect to a
local area network ("LAN"). The network interface 228 in the latter type
of system will include a LAN-type interface such as an Ethernet
interface.

The processor 230 performs data compression and other processes such
as those described above in conjunction with FIGURES 1, 2 and 4. To
reduce the complexity of FIGURE 5, the data memories and other
components associated with the processor 230 are not depicted.

After the processor 230 compresses the input data, the processor 230
sends the packets to the network interface 228. The network interface
228 processes the packet data and provides the appropriate physical and
data link layers to interface to the network (as represented by line
232). In practice, separate input and output network interface
components may be used in the compressor 224. The details of the
operation and implementation of the network interfaces 228 as described
are well known in the IP data networking art. Accordingly, these aspects
of the compressor 224 will not be treated in detail here.

As represented by the line 232 in FIGURE 5, packets from the
compressor 224 are routed over the network to the decompressor 226 on the
other end of the path. A network interface 234 terminates the physical and
data link layers and provides network layer (IP) packets to a processor 236.
The details of the operation and implementation of the network interface 234
may be similar to those aspects of the network interfaces (e.g., interface 228)
discussed above.

The processor 236 in the decompressor 226 performs data
decompression and other processes such as those described above in
conjunction with FIGURES 1, 3 and 4. To reduce the complexity of FIGURE
5, the data memories and other components associated with the processor
236 are not depicted.

After the processor 236 decompresses the input data, the processor
236 sends the packets to the network interface 234. The network interface
234 processes the packet data and provides the appropriate physical and
data link layers to interface to the network. The data is then forwarded to the
router 222.

FIGURE 6 illustrates an embodiment in which the compression and
decompression methods of the invention are integrated as software modules
in devices 240 that are installed at each end of a predefined path in a
network. The devices 240 may be routers, bridges, switches, modems or any
other device in the network that handles packet traffic.

The packet compression and decompression operations performed by
the embodiment of FIGURE 6 are similar to those described above in
conjunction with FIGURES 2-5. Compression software modules 242 and
decompression software modules 244 are linked to software modules 246 in
the devices in a manner that enables the compression software modules 242
and the decompression software modules 244 to intercept and process
packets routed through the devices 240. To reduce the complexity of
FIGURE 6, other components in the devices 240 such as data memories and
network interfaces are not depicted.

Typically, the compression and decompression software modules 242
and 244 may be implemented along the transmission path in a device 240
where the packets are fully visible. For example, packets
flowing through the network may be encrypted. Thus, the compressor and
decompressor software modules 242 and 244 may be linked in to the device
modules 246 so that the compression and decompression software modules
242 and 244 have access to decrypted data.

In FIGURE 6, the compression and decompression modules 242 and
244 are installed on both sides of a duplex link. Accordingly, packet traffic
traveling in either direction on the link may be compressed according to the
invention.
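
The duplex arrangement can be pictured with a minimal sketch. It assumes, purely for illustration, that each device holds one compression module and one decompression module and that both ends are primed with the same pre-defined history. The class name `Device`, the constant `SHARED_HISTORY`, and the use of Python's `zlib` (a DEFLATE implementation with preset-dictionary support) are stand-ins chosen for the example, not the modules 242, 244 and 246 themselves.

```python
import zlib

# Hypothetical pre-defined history shared by both ends of the link.
SHARED_HISTORY = (
    b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n"
    b"GET / HTTP/1.1\r\nHost: "
)


class Device:
    """Illustrative stand-in for a device with one module per direction."""

    def __init__(self, history):
        # Outbound traffic uses the compression module, inbound traffic the
        # decompression module; both start from the same history.
        self._comp = zlib.compressobj(zdict=history)
        self._decomp = zlib.decompressobj(zdict=history)

    def send(self, packet):
        # Z_SYNC_FLUSH emits everything needed to decode this packet now,
        # while keeping the history for later packets.
        return self._comp.compress(packet) + self._comp.flush(zlib.Z_SYNC_FLUSH)

    def receive(self, blob):
        return self._decomp.decompress(blob)


if __name__ == "__main__":
    left, right = Device(SHARED_HISTORY), Device(SHARED_HISTORY)
    request = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
    reply = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
    print(right.receive(left.send(request)) == request)  # left to right
    print(left.receive(right.send(reply)) == reply)      # right to left
```

Each direction keeps its own history state, so traffic flowing one way does not disturb the state used for traffic flowing the other way.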

FIGURE 6 also illustrates that the invention may be used on more than
a single IP hop. In FIGURE 6, the packets are routed through a network 248
(e.g., the Internet) and, as a result, they may travel over several hops. In
this case, appropriate routing provisions may be made to ensure that all
compressed packets are routed to the same receive module at the other end
of the path. This may include, for example, defining static routes using IP
tunneling.

In the compression scheme above, it is important to maintain the
reliability of the link when multiple packets are to be compressed or
decompressed using the same history file. This is because, in order to
decompress packet "n," the decompression module 244 must first
decompress packets "1" through "n-1." Reliability may be provided by a
reliability mechanism associated with TCP, HDLC or
PPP (in its reliable mode).
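
The ordering dependency can be illustrated with a short sketch. It assumes a DEFLATE stream from Python's `zlib` as a stand-in for the history-based compressor, with each packet flushed separately so that it can be sent on its own. The function names and sample packets are hypothetical.

```python
import zlib


def make_packets():
    # Sample packets that share most of their content.
    return [
        b"GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
        b"GET /about.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
        b"GET /contact.html HTTP/1.1\r\nHost: example.com\r\n\r\n",
    ]


def compress_stream(packets):
    """Compress packets against one shared, growing history."""
    comp = zlib.compressobj(wbits=-15)  # raw DEFLATE, no stream header
    return [comp.compress(p) + comp.flush(zlib.Z_SYNC_FLUSH) for p in packets]


def decompress_in_order(chunks):
    """Decode every packet in sequence so the history is rebuilt correctly."""
    decomp = zlib.decompressobj(wbits=-15)
    return [decomp.decompress(c) for c in chunks]


def decompress_skipping_first(chunks):
    """Try to decode packets 2..n without having seen packet 1."""
    decomp = zlib.decompressobj(wbits=-15)
    try:
        return [decomp.decompress(c) for c in chunks[1:]]
    except zlib.error:
        return None  # back-references point into history that was never received


if __name__ == "__main__":
    packets = make_packets()
    chunks = compress_stream(packets)
    print(decompress_in_order(chunks) == packets)
    print(decompress_skipping_first(chunks) == packets[1:])
```

Later packets are encoded as references into earlier ones, which is why a lost or reordered packet prevents the receiver from recovering the packets that follow it.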

In many of the embodiments described above, various initialization
procedures may be performed. For example, all history files may be erased
and various compression parameters may be exchanged between the paired
compressor and decompressor. In the data networking embodiments, the
initialization procedures may be accomplished using a relatively simple
three-way handshake such as the one used in TCP.
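
One way such an initialization exchange might look is sketched below as an in-memory simulation. The message names (`INIT`, `INIT_ACK`, `ACK`), the `Endpoint` class and the parameter keys are assumptions made for the example; the text above only states that history is erased and parameters are exchanged through a three-way handshake.

```python
from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Hypothetical compressor or decompressor state."""
    name: str
    history: bytearray = field(default_factory=bytearray)
    params: dict = field(default_factory=dict)
    ready: bool = False

    def reset(self, params):
        # Erase the history file and adopt the agreed parameters.
        self.history.clear()
        self.params = dict(params)


def three_way_init(compressor, decompressor, proposed):
    """Simulate the three messages and return them in order."""
    # 1. Compressor proposes parameters.
    init = {"type": "INIT", "params": dict(proposed)}
    # 2. Decompressor resets its state and echoes what it accepted.
    decompressor.reset(init["params"])
    init_ack = {"type": "INIT_ACK", "params": dict(decompressor.params)}
    # 3. Compressor resets with the acknowledged parameters and confirms.
    compressor.reset(init_ack["params"])
    compressor.ready = True
    ack = {"type": "ACK"}
    decompressor.ready = ack["type"] == "ACK"
    return [init, init_ack, ack]


if __name__ == "__main__":
    c = Endpoint("compressor", history=bytearray(b"stale"))
    d = Endpoint("decompressor", history=bytearray(b"old"))
    for message in three_way_init(c, d, {"window_bits": 15, "data_type": "html"}):
        print(message["type"])
```

After the exchange both ends hold empty histories and identical parameters, which is the precondition for the two sides building matching history from the first packet onward.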

From the above, it may be seen that the invention provides an
improved method of compressing data and increasing data throughput in a
network. In many applications, a system or method constructed or
implemented according to the invention will find history data that matches the
data being compressed earlier in the compression process than conventional
compression methods. As a result, data can be compressed more quickly
and to a higher degree of compression. In particular, a system or method
constructed or implemented according to the invention may provide
significantly higher compression ratios for relatively small data files (e.g., files
smaller than ten kilobytes).
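
The effect on small files can be tried with a brief sketch. It assumes the preset-dictionary feature of Python's `zlib` as a stand-in for synthetic history, and a hand-written `HTML_HISTORY` string that is purely hypothetical. The sizes it prints depend on the sample data and are not results from the invention.

```python
import zlib

# Hypothetical synthetic history for HTML: strings that often occur in pages.
HTML_HISTORY = (
    b'<!DOCTYPE html><html><head><meta charset="utf-8"><title></title>'
    b'</head><body><div class=""><p></p><a href="http://www."></a>'
    b"</div></body></html>"
)


def compress_plain(data):
    """Compress with an empty history, as a conventional compressor would."""
    comp = zlib.compressobj()
    return comp.compress(data) + comp.flush()


def compress_with_history(data, history):
    """Compress with the history pre-loaded, so matches exist from byte one."""
    comp = zlib.compressobj(zdict=history)
    return comp.compress(data) + comp.flush()


def decompress_with_history(blob, history):
    """The decompressor must be given the identical history."""
    decomp = zlib.decompressobj(zdict=history)
    return decomp.decompress(blob) + decomp.flush()


if __name__ == "__main__":
    page = (
        b'<!DOCTYPE html><html><head><meta charset="utf-8">'
        b"<title>Hi</title></head><body><p>Hello</p></body></html>"
    )
    plain = compress_plain(page)
    primed = compress_with_history(page, HTML_HISTORY)
    print(len(page), len(plain), len(primed))
    print(decompress_with_history(primed, HTML_HISTORY) == page)
```

A small file offers little internal repetition for the compressor to exploit, so nearly all of the gain in this sketch comes from matches against the pre-loaded history.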

While certain specific embodiments of the invention are disclosed as
typical, the invention is not limited to these particular forms, but rather is
applicable broadly to all such variations as fall within the scope of the
appended claims. To those skilled in the art to which the invention pertains
many modifications and adaptations will occur.

For example, the devices may be installed at various locations within
the network. The invention may be implemented using a variety of hardware
and software architectures. The teachings of the invention are applicable to
numerous compression algorithms and compression history techniques in
addition to those described above. The system and methods of the invention
may be used to compress and decompress various types of data. Also, many
techniques for identifying data types may be used. Thus, the specific
structures and methods discussed in detail above are merely illustrative of a
few specific embodiments of the invention.

WHAT IS CLAIMED IS:

1. A method of compressing data comprising the steps of:

generating synthetic history data associated with at least one
data type;

receiving data to be compressed;

determining a data type of the received data;

selecting synthetic history data associated with the determined
data type; and

compressing the received data using the selected synthetic
history data.

2. The method of claim 1 wherein the generating step is performed prior 
to the compressing step. 

3. The method of claim 1 wherein the determining step includes analyzing 
information associated with the received data. 

4. The method of claim 3 wherein the information identifies a TCP port. 

5. The method of claim 1 wherein the determining step includes analyzing 
the received data. 

6. The method of claim 1 wherein the synthetic history data includes
information frequently present in data of a given data type.

7. The method of claim 6 wherein the generating step includes 
compressing the synthetic history data. 

8. The method of claim 1 wherein the generating step comprises defining
history data.

9. The method of claim 1 wherein the compressing step comprises using
a Lempel-Ziv algorithm to compress the received data.

10. The method of claim 1 wherein the compressing step comprises using
an adaptive Huffman algorithm to compress the received data.

11. The method of claim 1 further comprising the step of associating a
reference with compressed received data, wherein the reference is used to
select synthetic history data for decompressing the compressed received
data.

12. The method of claim 11 wherein the reference identifies a data type.

13. A method of compressing data comprising the steps of:

receiving data to be compressed;

prior to compressing the received data, generating synthetic
history data that is associated with the received data; and

compressing the received data using the synthetic history data.

14. The method of claim 13 wherein the synthetic history data is
associated with at least one data type.

15. The method of claim 14 further comprising the step of determining a
data type of the received data.

16. The method of claim 15 further comprising the step of selecting the 
synthetic history data based on the determined data type. 

17. The method of claim 15 wherein the determining step includes
analyzing information associated with the received data.

18. The method of claim 17 wherein the information comprises a TCP port.

19. The method of claim 15 wherein the determining step includes
analyzing the received data.

20. The method of claim 13 wherein the synthetic history data includes
information frequently present in data of a given data type.

21. The method of claim 20 wherein the generating step includes
compressing the synthetic history data.

22. The method of claim 13 wherein the generating step includes defining
history data.

23. The method of claim 13 wherein the compressing step comprises
using a Lempel-Ziv algorithm to compress the received data.

24. The method of claim 13 wherein the compressing step comprises
using an adaptive Huffman algorithm to compress the received data.

25. The method of claim 13 further comprising the step of associating a
reference with compressed received data, wherein the reference is used to
select synthetic history data for decompressing the compressed received
data.

26. The method of claim 25 wherein the reference is an identifier
associated with a data type.

27. A method of decompressing data comprising the steps of: 

generating synthetic history data associated with at least one
data type;

receiving data to be decompressed;

determining a data type of the received data;

selecting synthetic history data associated with the determined
data type; and

decompressing the received data using the selected synthetic
history data.

28. The method of claim 27 wherein the determining step comprises
analyzing a reference associated with the received data.

29. A method of decompressing data comprising the steps of:

prior to decompressing the received data, generating synthetic
history data that is associated with the received data; and

decompressing the received data using the synthetic history
data.

30. The method of claim 29 wherein the synthetic history data is
associated with at least one data type.

31. The method of claim 30 further comprising the step of determining a
data type of the received data.

32. The method of claim 31 further comprising the step of selecting the
synthetic history data based on the determined data type.

33. The method of claim 31 wherein the determining step comprises
analyzing a reference associated with the received data.

34. A data compression system comprising:

a data memory for storing synthetic history data associated with
at least one data type;

a data type identifier for determining a data type of received
data;

a data selector for selecting synthetic history data associated
with the determined data type; and

a compressor for compressing the received data using the
selected synthetic history data.

35. The data compression system of claim 34 further comprising at least
one data network interface.

36. A data decompressing system comprising:

a data memory for storing synthetic history data associated with
at least one data type;

a data type identifier for determining a data type of compressed
data;

a data selector for selecting synthetic history data associated
with the determined data type; and

a data decompressor for decompressing the compressed data
using the selected synthetic history data.

37. The data decompression system of claim 36 further comprising at least
one data network interface.

38. A computer program product comprising:

a computer usable medium having computer readable
code means embodied therein for compressing received data, the
computer readable program code means in said computer program product
comprising:

storage medium in which synthetic history data
associated with at least one data type is stored;

means for determining a data type of received data;

means for selecting synthetic history data associated with
the determined data type; and

means for compressing the received data using the
selected synthetic history data.

39. A memory for storing data, said memory having a data structure stored
therein, said data structure including the stored data and comprising:

storage medium in which synthetic history data
associated with at least one data type is stored;

means for determining a data type of received data;

means for selecting synthetic history data associated with
the determined data type; and

means for compressing the received data using the
selected synthetic history data.